Abstract: When existing automatic portrait colorization algorithms are applied directly to scene sketches, distortion such as incorrect colorization and checkerboard artifacts arises due to the diverse line semantics of scene sketches. To address this issue, an automatic colorization algorithm with anime effects for scene sketches is proposed. Based on the conditional generative adversarial network, the structure of the U-Net generator used in existing automatic portrait colorization algorithms is improved, and a double-layer information extraction U-Net (DIEU-Net) is designed for the automatic anime-effect colorization of scene sketches. Firstly, a double-convolution sub-module for extracting salient information from a scene sketch (IESS) is designed. Then, a module integrating a double-layer IESS with a residual structure is inserted at different stages of the proposed generator. Thus, the global learning ability of the generator on important sketch-related features, such as colors and positions, is enhanced, and the network degradation caused by vanishing gradients as the network deepens is alleviated. Moreover, the deconvolution in U-Net is replaced by upsampling followed by convolution to suppress checkerboard artifacts. Experimental results show that the proposed algorithm avoids these distortions and achieves more reasonable and natural coloring effects than other algorithms. Furthermore, the proposed algorithm can be applied to the automatic anime colorization of various types of scene sketches.
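For concreteness, the PyTorch sketch below illustrates the two architectural ideas named in the abstract: a residual double-convolution block standing in for the double-layer IESS module, and a resize-convolution decoder step replacing transposed convolution. The abstract does not specify kernel sizes, channel counts, or normalization, so those details, along with the class names IESSBlock and UpsampleConv, are illustrative assumptions rather than the authors' exact design.

```python
# Minimal sketch of the two ideas described in the abstract. Layer
# configurations (3x3 kernels, BatchNorm, ReLU) are assumptions; the
# abstract does not give the exact IESS design.
import torch
import torch.nn as nn

class IESSBlock(nn.Module):
    """Hypothetical double-convolution information-extraction sub-module
    (IESS), wrapped in a residual connection as the abstract describes a
    module integrating a double-layer IESS with a residual structure."""
    def __init__(self, channels):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels, kernel_size=3, padding=1),
            nn.BatchNorm2d(channels),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, kernel_size=3, padding=1),
            nn.BatchNorm2d(channels),
        )
        self.act = nn.ReLU(inplace=True)

    def forward(self, x):
        # The residual shortcut mitigates degradation from vanishing
        # gradients as the network deepens.
        return self.act(self.body(x) + x)

class UpsampleConv(nn.Module):
    """Resize-convolution decoder step: nearest-neighbor upsampling followed
    by a stride-1 convolution, replacing transposed convolution to suppress
    checkerboard artifacts (Odena et al., Distill, 2016)."""
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.up = nn.Upsample(scale_factor=2, mode='nearest')
        self.conv = nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1)

    def forward(self, x):
        return self.conv(self.up(x))

# Quick shape check (illustrative):
x = torch.randn(1, 64, 32, 32)
print(IESSBlock(64)(x).shape)         # torch.Size([1, 64, 32, 32])
print(UpsampleConv(64, 32)(x).shape)  # torch.Size([1, 32, 64, 64])
```

Resize-convolution is used here because, unlike transposed convolution, each output pixel is computed from an evenly overlapping receptive field, which is the standard remedy for the checkerboard pattern the abstract mentions.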